Overview of the RUSProfiling PAN at FIRE Track on Cross-genre Gender Identification in Russian

Authors

  • Tatiana Litvinova
  • Francisco M. Rangel Pardo
  • Paolo Rosso
  • Pavel Seredin
  • Olga Litvinova
Abstract

Author profiling consists of predicting an author’s traits (e.g. age, gender, personality) from their writing. After mainly addressing age and gender identification at PAN@CLEF, in this RusProfiling PAN@FIRE track we address the problem of predicting an author’s gender in Russian from a cross-genre perspective: given a training set of Twitter data, the systems were evaluated on five different genres (essays, Facebook, Twitter, reviews, and texts in which the authors imitated the other gender, thereby changing their idiostyle). In this paper, we analyse the 22 runs submitted by 5 participating teams. The best results (although also the most sparse ones) were obtained on Facebook.

Similar resources

The Winning Approach to Cross-Genre Gender Identification in Russian at RUSProfiling 2017

We present the CIC systems submitted to the 2017 PAN shared task on Cross-Genre Gender Identification in Russian texts (RUSProfiling). We submitted five systems. One of them was based on a statistical approach using only lexical features, and the other four on machine-learning techniques using some combinations of gender-specific Russian grammatical features, word and character n-grams, and suffix n...

Cross-genre Gender Identification in Russian Texts Using Topic Modeling Working Note: Team DUBL

In this paper, we describe the results of gender identification from Team DUBL. We used a topic modeling approach for identifying the author’s gender based on his/her written texts. The model was trained on the RusProfiling PAN 2017 Twitter Corpus, which contains data in the Russian language. The model has been evaluated on texts of other genres, including texts such as letters to a friend, online...

Representation of Target Classes for Text Classification - AMRITA_CEN_NLP@RusProfiling PAN 2017

This working note describes the system we used while participating in the RusProfiling PAN 2017 shared task. The objective of the task is to identify the gender of the author from the author’s text written in the Russian language. Treating this as a binary text classification problem, we have experimented with developing a representation scheme for target classes (called class vectors) from the text...

AmritaNLP@PAN-RusProfiling : Author Profiling using Machine Learning Techniques

This paper illustrates work done on "Gender Identification in Russian texts (RusProfiling)" shared task, hosted by PAN in conjunction with FIRE 2017. The task is to predict the author’s gender, based on the Twitter data corpus which is in Russian. We will give a brief introduction to the task at hand, elaborate on the data-set provided by the competition organizers, discuss various feature select...

PAN at FIRE: Overview of the PR-SOCO Track on Personality Recognition in SOurce COde

Author profiling consists of predicting an author’s characteristics (e.g. age, gender, personality) from their writing. After mainly addressing age and gender identification at PAN@CLEF, and also personality recognition in Twitter, in this PAN@FIRE track on Personality Recognition from SOurce COde (PR-SOCO) we have addressed the problem of predicting author’s personality traits from her source ...

Publication date: 2017